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ABSTRACT 

21 cm tomography is emerging as a promising probe of the cosmological dark ages 
and the epoch of reionization, as well as a tool for observational cosmology in gen- 
eral. However, serious sources of foreground contamination must be subtracted for 
experimental efforts to be viable. In this paper, we focus on the removal of unre- 
solved extragalactic point sources with smooth spectra, and evaluate how the residual 
foreground contamination after cleaning depends on instrumental and algorithmic pa- 
rameters. A crucial but often ignored complication is that the synthesized beam of an 
interferometer array shrinks towards higher frequency, causing complicated frequency 
structure in each sky pixel as "frizz" far from the beam centre contracts across unre- 
solved radio sources. We find that current-generation experiments should nonetheless 
be able to clean out this points source contamination adequately, and quantify the 
instrumental and algorithmic design specifications required to meet this foreground 
challenge. 

Key words: Cosmology: Early Universe - Radio Lines: General - Techniques: Inter- 
ferometric ~ Methods: Data Analysis 



1 INTRODUCTION 



The use of the 21 cm line of neutral hydrogen as a probe of 
cosmological structure has generated much excitement over 
the last few years. 21 cm tomography promises to place ex- 



tremely tight constraints on cosmological parameters (Mc- 



QumnetaLj2006; Sa ntos fc Cooray 2006; Wyithe et al.|2007| 
Bowman et aL)2007| |Mao et al.|200 8,), as well as to act as a 



direct probe of the cosmological dark ages and the Epoch of 



Reionization (EOR) ( 


Madau et al.|1997 |Tozzi et al. 


2000a|b| 


Iliev et al.|2002||Furlanetto et al.|2004||Loeb & Za 


darriaga 


2004| Furlanetto et al.|2004| Barkana & Loeb|2005| 


Mack & 


Wesley||2008). As a result, many 21 cm experiments in the 



form of large radio arrays have recently been funded and are 
currently being built, e.g., LOFAsQ 21CMA[3 MWA^and 

papebEI 

The observational challenges, however, are daunting. At 
radio frequencies, one must deal with a long list of con- 
taminants that have the ability to swamp the cosmological 
21 cm signal in collected data. Terrestrially, for instance, one 
must contend with radio frequency interference from radio 
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Figure 1. Sample synthesized beam profile for a typical 21-cm 
tomography experiment with 500 antenna tiles. The tiles are dis- 



tributed within a diameter Da 



1500 m according to the 



density function p(r) ~ r~^. Prom the profile, it is clear that a 
realistic evaluation of foreground subtraction techniques should 
take into account the fact that beam widths vary as \/D. This 
is particularly important when subtracting off unresolved point 
sources, because a point source can fall on an off-beam "spike" 
that has an effectively unpredictable dependence on frequency. 
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communications. At these wavelengths, the ionosphere also 
interacts significantly with incoming photons, which may 
pose a challenge for instrumental calibration. Perhaps most 



worrisome of all are astronomical foreground sources ( 


Oh& 


Mack 


20031 ISantos et al.|2005| [Morales et al.|2006||Bowman 


2007 


de Oliveira-Costa et al.|2008 


Jelic et al.|2008 


). These 



act as contaminants to the signal and include unresolved 
extragalactic point sources, resolved bright point sources, 
and Galactic synchrotron radiation. Taken together, these 
sources contribute a brightness temperature on the order of 
hundreds of Kelvins, which easily overwhelms the cosmo- 
logical 21 cm signals that are expected to be on the order 
of mK. The ability to subtract off foregrounds is therefore 
crucial. 

Foreground subtraction in 21-cm tomography is partic- 
ularly challenging because current-generation instruments 
are not optimized for imaging. Instead, most experiments are 
designed to provide cosmological information through statis- 
tical measurements such as the power spectrum. Images of 
the 21-cm sky are expected to be of a rather poor quality 
in terms of both signal-to-noise and synthesized beam, and 
image-processing techniques devised to optimally mitigate 
this problem may be too computationally intensive to be 
useful. 

In this paper, we take a conservative approach and in- 
vestigate how well foreground cleaning can be done with a 
computationally simple method using only the the so-called 
dirty maps - distorted maps of the sky as seen by an instru- 
ment at many different frequencies. Specifically, we clean 
out the foregrounds in any given sky direction by fitting a 
model of the contamination to the corresponding sky pixels 
in all the dirty maps. This approach of cleaning one sky pixel 
at a time along the line-of-sight allows one to take advan- 
tage of the high spectral resolution offered by most 21 cm 
tomography experiments. The key idea is that explored in 
[Wang et al.| ( p006| : the point source emission (mainly syn- 
chrotron radiation is expected to vary smoothly with fre- 
quency while the redshifted 21 cm signal of interest varies 
rapidly, because a small change in frequency (and hence red- 
shift) corresponds to many Mpc in the radial direction and 
hence a major difference in local density fluctuations etc. 
High spectral resolution should thus allow one to separate 
out a rapidly oscillating cosmological signal from the spec- 
trally smooth foreground using a low-order polynomial fit. 



While previous studies (Wang et al. 2006 Jelic et al 



2008) have already highlighted the potential of this ap- 
proach, such studies are incomplete because they do not 
take frequency dependent beam effects into account. At a 
given frequency, the dirty map is the true sky multiplied by 
a broad (say 10°) function known in radio astronomy as the 
primary beam and then convolved with a function referred to 
as the synthesized beam. The primary beam width is of order 
9 ~ A/Dantcnna, whcrc Dantenna IS the physical sizc of the 
individual array elements, whereas the synthesized beam is 
the Fourier transform of the array layout. Figure [T] shows a 
synthesized beam example illustrating several key features: 

• There is a narrow central peak whose width is of order 
9 ~ A/Darray, where Darray is the size of the whole antenna 
array. This peak width is usually referred to as the angular 
resolution of the experiment. 

• There is positive and negative "frizz" extending far be- 



yond the central peak, corresponding to incomplete coverage 
of the Fourier plane, causing random-looking low amplitude 
oscillations on the angular resolution scale. 

• The synthesized profile is the same at all frequencies 
except for an overall scaling with the wavelength A. 

As one shifts to higher frequencies, the synthesized beam 
therefore shrinks like v'^ , causing the contributions from 
individual point sources to oscillate rapidly as positive and 
negative parts of the frizz moves across them. This produces 
an effective foreground signal looking not like a smooth syn- 
chrotron spectrum, but more like the rapidly oscillating cos- 
mological signal. 

Early work on this problem has provided cautionary but 
encouraging results ( Bowman|2007 |. Our goal in this paper 
is to quantify in detail how well the simple pixel-by-pixel 
foreground subtraction strategy can deal with this problem, 
using simulations including realistic radio arrays and beams. 
[Bowman et al.| ( |2008[ ) have independently studied this prob- 
lem with a complementary approach, and we compare our 
results with theirs below. Whereas they perform a detailed 
case study of how well the MWA experiment can meet this 
challenge, including all foregrounds, we instead tackle the 
broader question of how residual point source contamination 
depends on experimental specifications, and what the exper- 
imental design implications are. We ignore non-point-source 
contributions to the foregrounds, with the expectation that 
such components (such as Galactic synchrotron radiation) 
are rather benign relative to the unresolved sources. Being 
spatially smooth, the off-beam contributions of such fore- 
grounds tend to average out, unlike the point sources, even 
though their expected amplitude is larger. [Bowman et al.| 
( 2008 1 confirm this expectation. 



In this paper, we deal only with the removal of un- 
resolved point sources, assuming that resolved and detected 
sources can be eliminated by standard radio astronomy tech- 
niques as well as discarding the most contaminated pixels. 
We thus envision the cleaning of foregrounds as a two-step 
process: first masking or deconvolving out bright resolvable 
point sources that exceed some flux cut, then proceeding 
with our proposed algorithm. We focus solely on the viabil- 
ity of the second step, but explore how the results depend 
on the flux cut from the first step. 

The rest of this paper is organized as follows. In Section 
[2] we outline the simulation methodology. We go through the 
simulation of foregrounds, the simulation of radio interfer- 
ometers, and the proposed foreground subtraction strategy 
itself. The overall results of our analysis are presented in 
Section [3.2[ where we consider three possible scenarios for 
foreground subtraction, ranging from the most pessimistic 
to the most optimistic. In Section [3.3[ we quantify which 
design and algorithm choices are most important, and ex- 
amine how the interplay between various instrumental and 
algorithmic parameters gives rise to these results. Table [l] 
lists the parameters that we explore as well as the ranges 
over which we vary them. In Section |4] we summarize our 
results and discuss future prospects. 



2 METHODOLOGY 

Our basic approach is to simulate a point source sky, simu- 
late observed maps of it at multiple frequencies, clean these 
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Table 1. Range of our parameter space exploration for foreground cleaning. Parameters pertaining both to experimental specifications 
and to analysis method impact the success of the cleaning. 



Assumptions 



Low-Performance 
Extreme 



Fiducial Model 



High- Performance 
Extreme 



Experimental 


Tile Arrangement 


p{r) r 


p(r) ~ r ^ 


Monolithic with tiles 
separated by 40 m 




Rotation synthesis 


None 


6 hours, continuous 


6 hours, continuous 




Noise level 


cr-jp ~ 1 mK 


Noiseless 


Noiseless 


Analysis 


Primary beam 
width adjustments 


None 


None 


Adjusted to be 
frequency-independent 




Bright point source 


100 mjy 


10 mJy 


0.1 mJy 




flux cut Scut 










synthesized beam 


None 


None 


Resolutions equalised 




width adjustments 






by extra smoothing 




u-v plane weight- 
ing 


None (natural) 


Uniform 


Uniform 




Order of polyno- 


Constant 


Quadratic 


Quintic 




mial fit 










Range of polyno- 


80 MHz 


2.4 MHz 


2.4 MHz 



mial fit 



maps and quantify the residual contamination that remains 
after the cleaning. We repeat this a large number of times 
to explore the range design and algorithm options listed in 
Tabled 

A key feature of our cleaning method (which, again, is 
essentially that proposed in [Wang et al.| ( |2006[ ), but with 
dirty map effects taken into account) is that the algorithm 
is blind, in the sense that our method does not rely on 
the any physical modeling of foregrounds. This is impor- 
tant because the foregrounds relevant to 21-cm tomography 
are still relatively poorly understood (as compared to, say, 
the foregrounds relevant to cosmic microwave background 
experiments), and models are often based on interpolations 
and extrapolations from other frequency bands ( de Oliveira- 
|Costa et al.||2008l ). Because of this, we choose to use only 
the generic property that radio frequency foregrounds are 
smooth functions of frequency. This means that in our anal- 
ysis, the foreground and instrumental simulation steps are 
not only separate from each other, but also completely de- 
coupled from the foreground subtraction step. Thus, in what 
follows we divide our description of the methodology into 
each of those steps. 



2.1 Step I: Simulation of Foregrounds 

While point sources tend to cluster and are therefore not 
randomly distributed, the clustering is rather weak, and for 
simplicity we make the the assumption that they are uncor- 
related. We thus simulate the contribution to each sky pixel 
independently, with its flux being the sum of the fluxes of a 
large number of of randomly generated point sources. The 
brightness of each point source is randomly drawn from the 



^ Our baseline simulations are run without noise because we show 
in Section [3.2| that the noise contribution can be included analyt- 
ically and is unimportant for evaluating the effectiveness of our 
foreground subtraction strategies. 



source count distribution 
dn 



dS 



(4.0 mJy" 



880 mJy 



(1) 



which is appUcab le at = 150 MHz \Di Matteo et al.|2002 



Lidz et al. 20071. The spectral dependence of each point 



source is given by 



(2) 



where the spectral index a is randomly chosen from a Gaus- 
sian distribution 



p{a) = 



: exp 



(q - ao)^ 



2cr2 



(3) 



V27rcr2 

with ao = 2.5 and a = 0.5 ( [Tegmark et al.||2000[ ). Putting 
this all together, the brightness of each pixel is found by sim- 
ply summing the simulated point sources within that pixel: 



dT 



■V 5Z ^' 



sky 



(4) 



where S* is the flux of the z* ^ point source at 150 MHz, and 
dB/dT = 6.9xlO^(i//!/*)^mJyK"Sr~^ is the standard con- 
version between intensity and brightness temperature. We 
use a pixel size of flsky = 4.3 arcmin'^ in our simulations. 
In generating our point sources, equation [T] is truncated 
at some maximum flux Smax = Scut because the bright- 
est point sources are assumed to be detected and separately 
removed as mentioned above. We leave the threshold Scut as 
a free parameter, and we investigate how the effectiveness 
of our foreground subtraction algorithm changes as Scut is 
varied from O.lmJy to 100 mJy. At the dim end of the dis- 
tribution, we truncate at a minimum flux Smin because the 
source count distribution diverges as S ^ 0. We flnd that 
it is unnecessary to have 5min be any lower than 10"'' rnjy, 
as the total flux / has converged by then. These values for 
Suiin and Smax mean that we typically simulate A'^ ~ 2000 
point sources point sources for the sum in equation [4] 

All the simulations were performed on 1024 x 1024 grids, 
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Figure 2. A sample sky with point source foregrounds at u 
159 MHz 



and we verified that increasing the resolution to 4096 x 4096 
had essentially no effect on the results. In the line-of-sight 
direction, the lowest frequency that we simulate is u — 
158.73 MHz (corresponding to z = 8), and (except for in sec- 
tion 3.3.9 1 the frequency increases by = 0.03 MHz from 

i slices. In section [3.3.9| 
158.73 MHz, 



one frequency slice to another over I 
we continue to simulate 80 slices starting at f 
but we increase Av to 0.3 MHz and 1.0 MHz to quantify the 
effect of increasing the range of the polynomial fit. A sample 
sky at V — 159 MHz is shown in figure [2] 

Performing the sum in equation [4] at all frequencies for 
all 1024^ pixels would take on order a month with our soft- 
ware. We therefore accelerate our algorithm by exploiting 
the fact that the pixels are independent random variables. 
We first generate a database of 1000 pixels for which we com- 
pute the full frequency dependence. We then give each sky 
pixel the frequency dependence of a random pixel from the 
archive. It is easy to show that this procedure does not bias 
the residual power spectra low or high, but merely increases 
the variance in our estimation of these power spectra. By 
computing the scatter between multiple analyses using dif- 
ferent random seeds and database sizes, we find that this 
excess scatter becomes unimportant for our purposes once 
the database exceeds a few hundred pixels. 

We re-emphasize that the process used here pertains 
only to the simulation of foregrounds. The foreground sub- 
traction itself is blind and therefore does not depend on the 
assumed properties of the foregrounds. 



2.2 Step II: Simulation of Radio Interferometer 

As mentioned above, the response of a radio interferometer 
array can be described by two beam functions: the primary 
beam and the synthesized beam. The primary beam function 
encodes the power response of an individual interferometer 
element to different parts of the sky, while the synthesized 
beam is a convolution kernel that describes the interferomet- 
ric effects of the array as a whole. To a good approximation, 
what an interferometer sees is not the true sky, but the true 



sky multiplied by the primary beam and then convolved with 
the synthesized beam. 

In our simulations, we approximate the primary beam 
by a Gaussian. Its width scales as A/Dantenna, and is chosen 
so that the width matches simulated antenna patterns of the 
Murchison Widefield Array (MWA) dipole antennas at 150 
MHz (approximately 30° full-width-half-max) . The synthe- 
sized beam is found by taking the Fourier transform of the 
distribution of baselines (in the so-called u-v plane), which 
depends on the arrangement of tiles in the interferometer 
array. While our simulations of course require specific real- 
izations of the array layout, we expect our results to hold for 
any radio array whose tile distribution is qualitatively simi- 
lar to the various scenarios we consider in the results section. 
Roughly speaking, this means that our results should be ap- 
plicable to any array with a relatively large number of short 
baselines and a number of long baselines that span up to 
~ 1 km, giving an angular resolution that is on the order of 
arcminutes. For an array with very different dimensions, our 
results can be straightforwardly scaled. 

Integration time affects the response of the interferom- 
eter array in two ways that are important here. First, de- 
tector noise averages down with time as t~^^'^ as long as it 
is uncorrelated over different time periods. A typical point 
source sensitivity for a current-generation experiment like 
the MWA is S = 0.27 mjy, with a 4.6 square arcminute 
pixel, and a 32 MHz bandwidth. This corresponds to an 
r.m.s. detector noise per a pixel of ( Wang et al.|200'6 l: 



MWA n on / 32MHz f 1 hour \ 



V Ai 



1/2 



t J 



(5) 



where Av is the bandwidth, and t is the integration time. 
The MWA has a channel bandwidth of 32 kHz, and with 
~1000 hours of integration the detector noise level is ~ 0.2 K. 
In our simulations we consider not only current-generation 
experiments with this level of noise, but also the noise levels 
of hypothetical future experiments, which we take to be of 
order ~ 1 mK. 

The second effect of taking measurements over an 
extended period of time pertains to the concept of rotation 
synthesis. The rotation of the Earth during observations 
means that interferometer baselines are not static points 
on the u-v plane, but instead sweep out of u-v tracks. This 
results in a change in the properties of the synthesized 
beam, as one can see in figure [3] In our analysis, we explore 
a range of rotation synthesis scenarios from none at all (i.e. 
taking a snapshot of the sky) to having 6 hours of rotation 
synthesis. 



2.3 Step III: Foreground Subtraction 

As mentioned above, we use a pixel-by-pixel line-of-sight 
cleaning strategy for removing the unresolved foregrounds 
below the flux threshold. For each pixel, its frequency de- 
pendence is fit by a low-order polynomial. Such a polyno- 
mial is by definition a smooth function, and thus the hope 
is that by subtracting off the fit from the signal, one sub- 
tracts off mainly the smooth foregrounds and not much of 
the cosmological signal, which is expected to oscillate wildly 
with frequency. We leave the order of the polynomial as a 
free parameter, and explore the effects of varying it from 
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Real space (beam) 



Fourier space (distribution of baselines) 







Figure 3. The left hand column shows sample beam profiles (a 
real-space description of the beam) while the right hand column 
shows the corresponding u-v distribution of baselines (a Fourier- 
space description of the beam) . The top row illustrates an array 
with no rotation synthesis, while the bottom row shows an ar- 
ray with 6 hours of rotation synthesis. The real-space beams are 
normalized so that their peaks are at 1. 



(constant) to 5 (quintic). This involves a tradeoff between 
two separate effects. On one hand, the higher the order of 
the fit, the lower the residual foregrounds will be. On the 
other hand, high order fits come at the expense of fitting 
more power out of the cosmological signal. Crudely speak- 
ing, one therefore wishes to select the lowest order capable 
of adequately subtracting off foregrounds. 



3 RESULTS 

In this section, we quantify the extent to which point source 
contamination can be removed with line-of-sight cleaning, 
and how this depends on the many parameters in Table 1. 
To get a sense of the extent to which the parameters mat- 
ter, we then analyze three scenarios designed to bracket the 
range of possibilities in Section |3.1[ As seen in Figure [4j 
this range is broad, extending all the way from utter fail- 
ure to recover the cosmological signal to success in pushing 
foregrounds two orders of magnitude below the signal. Af- 
ter discussing some general findings that are independent of 
all these parameters in Section |3.2[ we devote Section |3.3| 
to examining the parameters one by one to quantify which 
ones make the greatest difference. 



3.1 Three scenarios 

We now analyze three scenarios designed to bracket the 
range of possibilities: 

(i) Pessimistic scenario (PESS) 




0.1 0.2 0.3 0.4 0.5 0.6 0.7 

Comoving k in h/Mpc 

Figure 4. 2D power spectra of foregrounds and foreground resid- 
uals for the various scenarios outlined in the text. It is clear that 
except for the most pessimistic scenario, a dirty-map pixel-by- 
pixel cleaning strategy can be used to get within at least striking 
distance of the cosmological signal. 



• Experimental assumptions: In this scenario, we sim- 
ulate an array whose tiles are arranged in a radial density 
profile that goes as . The array takes a single snapshot 
of the sky. 

• Analysis: No extra weighting is imposed on the u-v 



plane (see Section 3.3.7 for details on weighting schemes). 



and no attempt is made to adjust for the fact that both the 
primary beam and the synthesized beam have frequency- 
dependent widths. It is assumed that only sources that are 
10 mjy or brighter can be removed prior to the subtrac- 
tion of unresolved point sources. We also assume that the 
highest order polynomial that one can fit without taking 
out significant power from the cosmic signal is a quadratic. 

(ii) Fiducial scenario (MID) — this scenario is de- 
signed to be reasonably representative of current-generation 
experiments. 

• Experimental assumptions: Same as PESS except 
that the array samples the u-v plane continuously for 6 
hours. 

• Analysis: Same as PESS except that the u-v plane 
measurements are weighted so that all parts of the u-v 
plane that are covered are given a uniform weight. 

(iii) Optimistic scenario (OPT) 

• Experimental assumptions: Same as MID. 

• Analysis: Same as MID except that we assume that 
bright point sources can be removed down to 1.0 mJy, and 
that one can safely fit up to a cubic polynomial. 

Our fiducial model is a middle-of-the-road (MID) case in- 
tended to be representative of current generation experi- 
ments, while the pessimistic (PESS) and optimistic (OPT) 
scenarios refer to worst and best case experimental expecta- 
tions, respectively. The PESS and OPT scenarios in general 
differ less than the low-performance and high-performance 
extremes outlined in Table[l]because the parameters of these 
extreme scenarios are in some cases unrealistic, and are con- 
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sidered in Section |3.3| solely to better understand how the 
various parameters affect foreground subtraction. 

As our measure of how well or poorly the cleaning 
works, we use the two-dimensional spatial power spectrum 
of the cleaned map. We do this because the power spectrum 
of the cosmological signal is the key quantity that the 21 cm 
tomography community aims to measure in the near term, 
and the residual point source power spectrum will simply 
add to this. In all our power spectrum plots (starting with 
Figure [4| , we have removed the distorting effect of convolu- 
tion with the synthesized beam, so that P2D{k) for the point 
sources is a constant (fc-independent) white noise spectrum. 
The detector noise also has a constant spectrum, and the 
cosmological 21 cm signal plotted is what would be seen by 
an ideal instrument making a distortion-free image of the 
sky. The deconvolution step is trivial to perform in Fourier 
space, corresponding simply to dividing the measured power 
spectrum by the radial distribution of baselines. 

Rather than plot P2D{k) itself, we plot the dimension- 
less quantity [k'^ P2D{k)]^^^ , which can be roughly inter- 
preted as the dimensionless fluctuation lever, analogous to 
the standard quantity 5t /T oc [I'^Ce]'^ for the cosmic mi- 
crowave background and the quantity A oc [k^ Psoik)]^^^ 
for three-dimensional galaxy surveys. In the figures that fol- 
low, [k^ P2D{k)Y^^ is always plotted for the frequency slice 
at v = 158.73 MHz. This translates to a redshift z — 8, 
around the "sweet spot" of most current generation 21 cm 
experiments. 

Our cleaning method is linear, i.e., the "data cube" con- 
taining the cleaned maps at different frequencies is simply 
some linear combination of the input data cube, and the 
weights in this linear combination (implementing the poly- 
nomial fits) are fixed, independent of what the data cube 
contains. This means that if we have three data cubes, con- 
taining cosmological 21 cm signal, detector noise and point 
source signal, respectively, we get the same result if we sum 
them and then clean as if we first clean them individually 
and then sum them. Since these three components are all 
statistically independent, this also implies that their post- 
cleaning power spectra simply add. We take advantage of 



this fact (which we also verify numerically in Section 3.3.21 
to simplify our analysis, performing most of our simulations 
with both the cosmological signal and the noise set to zero. 
A perfect foreground subtraction algorithm would thus pro- 
duce a residual power spectrum that is zero everywhere. 

Figure [4] shows the results. For comparison, the figure 
also shows the simulated FOR signal for 2 = 8 (taken from 
McQuinn et al.| ( |2006| ) and [McQuinn et al.| ( |2007[ )). Figure 
4] shows that except for in the PESS scenario, our simple 
foreground subtraction algorithm is successful on large scales 
but not on small scales. Let us now focus on understanding 
this better, and clarifying the key properties of the cleaning 
method. 




159 159.5 160 160.5 

Frequency in MHz 



Figure 5. The spectra of a three typical dirty-map pixels, with 
fits given by the dotted curves. The top panel assumes no in- 
strumental noise and no point sources brighter than 0.1 mjy. The 
middle panel assumes no instrumental noise and no point sources 
brighter than 100.0 mJy. The bottom panel assumes an instru- 
mental noise level of or = 200 mK and no point sources brighter 
than 0.1 mJy. 



3.2 Why the method can work well on large scales 

In the top panel of figure [5] we show the spectrum for a 
pixel randomly chosen from the dirty map produced by a 
noiseless but otherwise MWA-like instrument that has been 
continuously observing a fixed patch in the sky for 6 hours. 
A bright point source flux cut of Scut ~ 0.1 mJy has been 
assumed. In figure[5]we can see that while the overall trends 



of the spectrum can be captured by a low-order polynomial 
fit, the finer features generally remain as residuals. 

The effectiveness of the polynomial fit depends on a 
variety of factors. We find that the greatest variation comes 
from varying Scut s^nd the noise level. In the middle panel 
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of figure [sj we show a typical pixel with Scut ~ 100.0 mj^j^ 
Similarly, the addition of noise can decrease the quality of 
the fit. In bottom panel of figure [S] we have added ctt = 
200 mK (note the larger amplitudes on the y-axis compared 
to what is shown in the top panel). 

From figure[5j the reader may be surprised that our sub- 
traction strategy works at all. Indeed, it seems from the fig- 
ure that low order polynomial fits do extremely poorly, and 
that most of the foregrounds will remain after the subtrac- 
tion. So why, then, does Figure [4] show that the foregrounds 
can be suppressed by many orders of magnitude? 

The above-mentioned linearity holds the key to the suc- 
cess. To understand why, consider again the 3-dimensional 
data cube, and imagine it arranged so that the 2-dimensional 
dirty maps are all horizontal, stacked vertically on top of 
one another so that the vertical direction corresponds to 
frequency. To generate a plot such as Figure |4j we perform 
two operations on this cube of simulated data: 

(i) Clean one pixel (vertical column) at a time 

(ii) Fourier transform one frequency map (horizontal 
slice) at a time 

Since both of these steps are linear, they can each be thought 
of as multiplying all the data (rearranged into a single vec- 
tor) by some matrix. Since the two steps mix data purely 
horizontally and purely vertically, respectively, it is easy to 
see that the corresponding two matrices must commute. In 
other words, we get the same result if we perform the steps in 
the opposite order (our simulations confirm this) . An analo- 
gous and more familiar example is the 3-dimensional Fourier 
transform, which decomposes into successive Fourier trans- 
forms in the vertical and two horizontal directions, and again 
gives the same result regardless of the order in which these 
operations are performed. 

This means that we get the exact same result in Fig- 
ure[4]if we first Fourier transform the maps, then perform the 
cleaning one Fourier mode at a time instead of one pixel at a 
time. Figure [6] shows the spectral fit to a sample pixel in on 
the Fourier {u-v) plane, revealing a very smooth curve that 
can be fit with exquisite accuracy. The success for this par- 
ticular Fourier mode hinges on the fact that it lies in the cen- 
tral part of the Fourier plane which has been well sampled by 
the array at all frequencies. This explains the low plateau in 
the residual power spectra in the left side of Figure [4] The 
sharp rise in residual power on smaller scales corresponds 
to incomplete Fourier coverage, with certain interferometric 
baselines missing. Converting this intuitive understanding 
back to real space, the small-scale synthesized beam frizz 
seen in Figure [l] is largely averaged away when expanding 
the sky into long-wavelength Fourier modes, whereas the 
small-scale modes are severely affected. The characteristic 
scale separating "short" and "long" Fourier modes is de- 
termined by the longest baseline radius for which u-v cov- 
erage is complete. As one moves outwards from the origin 
in Fourier space, one eventually reaches a pixel where u- 
V coverage is incomplete at some frequency along the line 

^ Note that in general, dirty-map pixels may be negative. Such 
negative brightness temperatures arise from the fact that in radio 
arrays, the signal from a single tile is never correlated with itself. 
We thus never sample the origin of the u-v plane, and therefore 
lose the mean of the signal. 
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Figure 6. The spectrum of a Fourier space pixel, taken from the 
part of the plane where u-v coverage is complete. A fit to the 
spectrum is shown. 



of sight. Once there is any such missing information along 
the frequency direction, our simple fitting algorithms fail to 
find a good fit and the result is a large increase in residu- 
als. Since this increase is caused by incomplete u-v cover- 
age, the scale at which this occurs depends on the rotation 
synthesis scenario being considered. We find, however, that 
for most reasonable cases the upswing takes place between 
k = 0.3/i/Mpc and 0.4VMpc. 

The reader may also be concerned (based on figure [5| 
that there may be much residual power in the frequency di- 
rection, leading to problems for experiments aiming to mea- 
sure the 3D power spectrum. A full discussion of the 3D 
power spectrum is beyond the scope of this paper, but for 
now we note that plots like figure [S] are simply real space 
plots in the line-of-sight direction. Fourier transforming in 
the perpendicular direction, the fit becomes quite good in 
the central parts of the u — v plane as mentioned above. 
Fourier transforming in all three directions (which is what 
one would ultimately do in order to find estimate the 3D 
power spectrum), one does not expect the residual power to 
depend on both the line-of-sight and transverse wave num- 
ber. Indeed, Bowman et al. (20081 find that while there is 

there do exist 



some contamination along the line-of-sight, 
clean regions in the full 3D Fourier space. 

Aside from residual foregrounds, another problem that 
may arise is the fact that blind algorithms such as ours have 
no way to distinguish between signal and foreground except 
for through differences in smoothness as a function of fre- 
quency. Thus, if the signal possesses any smooth large-scale 
component, the foreground subtraction algorithm may in- 
advertently remove this component of the signal with the 
foregrounds. Deciding whether this is a problem or not (and 
quantifying the seriousness if it is indeed a problem) re- 
quires the detailed examination of the properties of the sig- 
nal, which is beyond the scope of this paper. Intuitively, how- 
ever, one can say (at the very least) that there will be no loss 
of cosmological signal in the transverse Fourier components, 
since our algorithm subtracts along the line-of-sight. 
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Figure 7. Dependence on array layout: 2D power spectra for the 
fiducial model with various arrangements of tiles. Monolithic and 
arrangements both seem to work well. 

3.3 Exploration of parameter space 

Above we saw that there was a huge difference in foreground 
removal ability between the three scenarios. Let us now in- 
vestigate which of the instrumental and algorithmic param- 
eters in Table [l] make the greatest difference. We do this 
by varying one of them at a time, over the range given in 
Table [T] while keeping all others fixed. 

3.3.1 Array Layout 

In a radio array, it is the arrangement of antenna elements 
that determines the distributions of baselines, and it is the 
distribution of baselines that determines the shape of the 
synthesized beam. Thus, our ability to subtract foregrounds 
depends strongly on the layout of the array. 

In an experiment such as the MWA, each antenna el- 
ement consists of a tile of 16 crossed dipole antennas, ar- 
ranged in a 4 by 4 grid with the dipoles spaced roughly 1 m 
from each other. In this configuration, the dipoles give rise 
to a complicated primary beam pattern which may include 
structures such as sidelobes. As mentioned in previous sec- 
tions, we neglect these complications and assume that each 
tile of dipoles has a primary beam that takes the form of a 
Gaussian with a 30° fuU-width-half-max. 

We consider two different types of tile arrangement. One 
has the density of tiles varying as r~^ , where r is the radius 
from the centre of the array. The other, which we catego- 
rize as a "monolithic" arrangement, is a regular square grid 
densely packed near the centre of the array. A few tiles are 
placed farther from the centre to provide some long baselines 
for calibration purposes and for good angular resolution. In 
both cases, we have a total of 500 tiles in our simulations; 
for the monolithic cases 400 tiles form the central core, while 
100 tiles scattered outside the core provide some short base- 
lines (which can be formed between two close-by tiles that 
are both outside the core) and many long baselines (which 
are formed by pairing up a tile within the core to one out- 
side the core). We consider two realizations of the monolithic 
cases, one with tiles in the central core separated by 25 m 
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Figure 8. Effect of noise: 2D power spectra for the fiducial model 
with various levels of instrumental noise show that the power 
spectrum of detector noise simply gets added to the power spec- 
trum of the residual foregrounds. 

and with them separated by 40 m. In both cases, the combi- 
nation of short-to-medium-length baselines formed between 
tiles in the core and the extra few short baselines provided 
by closely separated tiles outside the core allow complete 
coverage of the u-v plane near the origin with the help of 
rotation synthesis. 

The results are shown in figure [T] From the plot, it is 
clear that good foreground subtraction can be done using ei- 
ther an arrangement or a monolithic arrangement, and 
that on large scales there is no difference in performance. 
Monolithic arrangements, however, seem capable of pushing 
foreground subtraction to slightly finer scales. The expla- 
nation for this is that with an arrangement, the high 
density of tiles at small r means that the inner parts of the 
u-v plane are vastly oversampled given the rotation synthe- 
sis, and one can spread out the tiles slightly to extend the 
perfectly covered regions of the u-v plane without introduc- 
ing any gaps thanks to the tight monolithic core. 

3.3.2 Noise Level 

As mentioned above, we consider three different scenarios in 
our analysis of noise effects: a noise level ar ~ 200 mK (rep- 
resentative of current-generation instruments), a noise level 
(tt ~ 1 mK (hopefully representative of next-generation in- 
struments), and a hypothetical noiseless case. The resulting 
power spectra are shown in figure [8] For all but the noiseless 
case, there are in fact two sets of curves plotted: one show- 
ing the result of the simulation with noise, and the other 
showing the noiseless simulation with the input noise power 
spectrum (a constant) added afterward. These two sets of 
curves lie on top of each other and are visually indistinguish- 
able, confirming that our linear cleaning method leaves the 
noise power essentially unaffected. It is therefore unneces- 
sary to run simulations with noise, since the results can be 
predicted analytically given a noiseless simulation. 

A naive reading of Figure [8] suggests the pessimistic 
conclusion that current-generation 21 cm tomography ex- 



Will point sources spoil 21 cm tomography? 9 



10 - Original sky 




0.2 0.3 0.4 0.5 

Comoving k in h/Mpc 



0.6 0.7 



10^ - Original sky 




0.1 



0.2 0.3 0.4 0.5 

Comoving k in h/Mpc 



0.6 0.7 



Figure 9. Effect of rotation synthesis: 2D power spectra are sfiow 
for the fiducial model but with various total rotation synthesis 
times. The effect of lengthening integration time to increase u-v 
coverage is seen to saturate after about 4 hours. 



Figure 10. Effect of temporal binning; 2D power spectra for the 
fiducial model but with varying binning time At, showing that 
one should strive for continuous u-v coverage. 



periments have no hope of seeing a cosmological signal. How- 
ever, the additive contribution to the power spectrum from 
noise can be completely eliminated by measuring the power 
spectrum by cross correlating maps made at different times, 
since their noise will be uncorrelated. The WMAP team have 
successfully used an analogous procedure for noise bias re- 
moval, where they cross correlated maps made not at dif- 



ferent times but with different receivers (Hinshaw & et. al 
I [WMAP collaboration] ||2007| ). In contrast, the contribution 
from residual point sources can not be eliminated in this 
way. 

Just like for WMAP, the noise will still contribute to 
the error bars AP{k) on the measured power spectrum, 
but these error bars can be shrunk by averaging many 
Fourier modes with comparable wave number k. For the 
case where noise and signal have a Gaussian distribution, 
AP(k) = P{k) where P{k) is the total power spec- 

trum (including noise and residual point sources) and 
is the number of Fourier modes averaged. For example, the 
noise power exceeds the signal power around the third acous- 
tic peak of the CMB power spectrum measured by WMAP, 
but N is large enough that the error bars nonetheless be- 
come significantly smaller than the cosmic signal. In our 
case, binning radially in annuli of thickness k = 0.1/i/Mpc, 
we typically have TV ~ 10'* to 10^ (depending the value of 
fe), which would suggest percent level relative errors on the 
plotted power spectrum curves. 

In summary, our results regarding noise are quite en- 
couraging: the promising forecasts that have been made in 
the past for what can be learned from 21 cm cosmology all 
included detector noise, but not the point source contribu- 
tion in the full complexity that we are modeling. Our results 
show that the fact the point sources can be removed without 
having much effect on the noise levels. 



3.3.3 Rotation Synthesis 

In a maximally optimistic situation, rotation synthesis can 
dramatically boost one's ability to image the sky, since for 
every baseline, one effectively obtains an entire arc of base- 
lines on the u-v plane. However, rotation synthesis is not 
always as readily available as one might hope. For instance, 
a declination zero object as viewed from the equator does 
not rotate but merely translates across the sky, allowing no 
rotation synthesis. 

As another example, consider a patch of the sky that 
lies close to the horizon. The short time interval between the 
rising and setting of the patch means that even if continuous 
rotation synthesis is possible, one may not obtain sufficient 
total rotation synthesis time. (Observing the patch again on 
the next night does not alleviate the problem, for a side- 
real day later one is simply sampling the same points on the 
u-v plane again). One can also imagine a situation where 
u-v coverage is poor even for a patch that remains in the 
sky for most of the night because of limitations in hardware 
calibration and data fiow. In such a scenario, each baseline 
would sweep out a long track on the u-v plane, but this track 
would only be sparsely sampled by a series of snapshots of 
the sky. We thus use two separate parameters to parame- 
terize the quality of rotation synthesis: the total rotation 
synthesis time and the time At between snapshots of the 
sky. 

As expected, the performance of our foreground sub- 
traction algorithm improves as one increases the total ro- 
tation synthesis time. In figure |9j we show the effect of al- 
lowing integration time to increase from a single snapshot 
to 6 hours (beyond which there is no significant improve- 
ment in foreground subtraction). Each curve was simulated 
by assuming continuous rotation synthesis (i.e. At 0) us- 
ing an instrument located at latitude and longitude (A, 0) = 
(— 27°,0°) for a field with right ascension and declination 
[a, 5) = (60°, -30°Q We see that the first few hours of in- 



^ The latitude, right ascension, and declination were chosen to 
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Figure 11. Effect of primary beam equalization: 2D power spec- 
tra for the fiducial model but with different algorithms for deal- 
ing with the fact that the primary beam width changes with fre- 
quency. Adjusting for the frequency dependence is seen to improve 
foreground subtraction slightly. 



Figure 12. Beam profiles after an extra Gaussian convolution, 
designed to make the heights and widths of the central peaks 
frequency-independent. Parts of the beam beyond the central 
peak, however, will in general remain dependent on frequency. 



tegration give rise to dramatic improvements, and that the 
gains begin to saturate after about 4 hours. Intuitively, the 
saturation occurs because after several hours of integration, 
one is revisiting parts of the plane that have already been 
well sampled by other baselines, and so there is no further 
improvemenlj^ As mentioned earlier in this section, the ef- 
fectiveness of rotation synthesis depends on the location of 
the array as well as the sky coordinates of the patch being 
observed. We find, however, that our conclusion is fairly ro- 
bust to changes in array location and sky coordinates. In 
other words, after 4 hours of observation one has essentially 
exhausted the potential of rotation synthesis, however small 
or large this potential may be. 

In figure |10[ we instead fix the integration time at 6 
hours and vary At. It is clear that one should strive for 
continuous u-v coverage if possible. It should also be noted 
that with both parameters, it is generally the smallest scales 
that benefit from rotation synthesis - the array layout 
provides enough short baselines to ensure reasonable cover- 
age on large angular scales even without exploiting Earth 
rotation. 



3.3.4 Primary beam width adjustments 

As mentioned in Section |2.2| the width of a radio array's 
primary beam is proportional to A. Thus, even if one has 
perfect coverage of the u-v plane, the sky will look different 
at different frequencies. In doing foreground subtraction, we 
consider two possible ways to analyze the data: 
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Figure 13. Effect of synthesized beam adjustment: the attempt 
to smooth all maps to a common resolution before cleaning is 
seen to do more harm than good. 

(i) Do nothing, and simply accept the fact that the pri- 
mary beam width changes with frequency. 

(ii) Smear the u-v plane data to smooth out the primary 
beams so that all the primary beams have the same width 
as the widest beam in the frequency range. 

The results are shown in figure [TT] A close look at the plot 
reveals that adjusting for the frequency dependence of the 
primary beam does in fact bring about an improvement, 
albeit a small one, in our ability to subtract foregrounds. 



match planned observations by the MWA. The longitude was set 
to zero for computational simplicity since its precise value does 
not result in qualitative changes in rotation synthesis. 
* That does of course not imply that one should stop integrat- 
ing after 4 hours of observation, for repeated measurements of the 
same u-v points increases signal-to-noise. Indeed, current obser- 
vation plans call for thousands of hours of integration. 



3.3.5 synthesized beam wtdth adjustments 

The synthesized beam width is also expected to scale as A, 
as illustrated in figure [T] In addition to the "frizz" issue 
that we have focused on so far, a second potential cause 
for concern is the width change of the central peak, as it 
means that sources slightly off from the centre of the beam 
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will appear dimmer at higher frequency. This can potentially 
degrade the spectral fits. One way of adjusting for this is to 
convolve the dirty maps with frequency-dependent kernels 
whose frequency dependence exactly compensates for the 
changing width of the central part of the synthesized beam. 
The effect of this extra step is to ensure that, aside from 
the frizz effect, the sky has been convolved with the exact 
same beam at all frequencies. In other words, the angular 
resolution is made frequency independent by degrading the 
resolution of all maps to that of the lowest frequency map. 
In figure [12] we show the profiles of the beams from figure 
[1] after convolution with Gaussians of appropriate widths. 

As mentioned above, the procedure tested in this sec- 
tion deals with the central peak, and has no mitigating ef- 
fect on the "frizz". Since the "frizz" is responsible for non- 
cosmological line-of-sight structure, one expects the beam 
adjustment to have very little effect in the frequency di- 
rection. Figure [13] shows how this beam adjustment affects 
the the foreground subtraction in the spatial directions. On 
the largest and smallest scales, the extra convolution is seen 
to have no effect, and on intermediate scales smaller scales 
it is seen to make things slightly worse. This procedure of 
adjusting for changes in angular resolution with frequency 
therefore does more harm than good. The harm presumably 
comes from the exacerbating the frizz-related problems (by 
causing departures from uniform weighting in the Fourier 
plane discussed in Section 3.3.7 below). The good comes 
from correcting for the above-mentioned dimming effect. 
However, since A and hence the angular resolution varies by 
only a couple of percent across our frequency band, the point 
dimming effect will be very small for most point sources. 
Moreover, it will be a smooth function of frequency which 
can be accurately matched by our fitting polynomial, thus 
making angular resolution adjustments rather redundant. 



3.3.6 Flux Cuts 

Since it remains unclear down to what fiux cut 5cut one will 
be able to resolve and remove point sources with upcom- 
ing 21 cm tomography experiments, we examine how vary- 
ing Scut affects our algorithm for cleaning out unresolved 
point sources. In figure fTl) we vary Scut from 0.1 mjy to 
100.0 mjy. The results are shown in Figure and reveal a 
simple and useful scaling: raising the input power spectrum 
by some factor by increasing the fiux cut raises the output 
by the same factor. This means that there is no need to 
rerun simulations with many fiux cut levels, as the results 
from a single simulation can be analytically scaled to apply 
to all other cases. All we need to extract from simulations 
is the factor by which the cleaning algorithm suppresses the 
point source fluctuations — in this case, the suppression is 
seen to be about six orders of magnitude on large scales. 

Quantitatively, the residuals are seen to lie below the 
theoretical 21-cm signal as long as the fiux cut is below the 
100.0 mJy level. However, there is some variability in results 
with frequency, both from variations in the residual power 
spectrum and in the magnitude of the theoretical curve. To 
be conservative, it is therefore prudent to aim to be able 
to remove bright point sources down to the lOmJy level. 
This is consistent with the results in [Bowman et alT] ( |2008[ ), 
where it was found that foreground contaminants could be 
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Figure 14. Effect of flux cut for bright source removal: 2D power 
spectra for the fiducial model but with various bright point source 
flux cuts. 
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Figure 15. Effect of u-v plane weighting: 2D power spectra for 
the flducial model but with two different weighting schemes for 
the u-v plane. A uniform weighting of the u-v plane is seen to far 
outperforms natural weighting. 



subtracted to an acceptable level starting from a sky model 
with a fiux cut of 10 mJy. 

These results were derived using quadratic fits. In sec- 
tion |3.3.8[ we will see that the order of the polynomial has 
a strong effect on residuals, so if this 10 mJy goal cannot be 
met, then the residual power spectrum can alternatively be 
brought down further by increasing the order of the fitting 
polynomial. 



3.3.7 Weighting scheme on u-v plane 

In general, a radio array will not provide uniform coverage 
of the u-v plane - typically, some parts of the plane will be 
sampled more than once, while other parts will not be sam- 
pled at all. To compensate for this, one may decide to weight 
different parts of the u-v plane differently. In figure [151 we 
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Figure 16. Effect of fitting range: 2D power spectra for tiie fidu- 
cial model but with the foreground subtraction performed by fit- 
ting polynomials of various degrees to the spectra. 
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Figure 17. Effect of fitting range: 2D power spectra for the fidu- 
cial model but with the foreground subtraction performed by fit- 
ting quadratic polynomials to the spectra over a variety of fre- 
quency ranges. 



examine a "uniform" weighting scheme (where all sampled 
parts of the u-v plane are given equal weight) as well as a 
"natural" weighting scheme (where the u-v measurements 
are not given any weighting beyond their "natural" density 
as determined by the instrument). It is clear that a uniform 
weighting far outperforms a natural one. An intuitive ex- 
planation for why the uniform weighting so outperforms the 



natural weighting can be found in Bowman et al. ( 2008 1. Be- 



cause the filling of the u-v plane is accomplished slowly by a 
discrete set of baseline loci, the final distribution of baselines 
will not be particularly smooth. The frequency dependence 
of the synthesized beam then maps this u-v lumpiness into 
an incoherence in the frequency direction that is difficult to 
fit out using smooth polynomials. With a uniform weight- 
ing, one artificially normalizes the baseline distribution so 
that each pixel in the sampled part of the u-v plane has the 
same weight, thus ensuring that the distribution of baselines 
is smooth. 



therefore not fit over the broad frequency range probed by a 
typical 21-cm tomography experiment. Instead, one should 
divide the spectra into smaller frequency bands and subtract 
foregrounds individually from each band. 

The effect of changing the frequency range is of course 
rather degenerate with the effect of changing the degree of 
the fitting polynomial, since what matters is ultimately the 
number of degrees of freedom that are fit out per unit fre- 
quency. As we increase either the polynomial order or the 
number of frequency bands, we are removing cosmological 
signal on ever smaller scales along the line of sight. One 
should therefore fit over as broad a frequency range as pos- 
sible subject to the constraint that the predicted residuals 
lie comfortably below the expected cosmic signal. From fig- 
ure [17] we see that a frequency range of a few MHz appears 
appropriate for quadratic fits. 



3.3.8 Order of the polynomial fit 

As we mentioned above, one should aim to fit the spectra 
with polynomials that are as low-order as possible, with the 
constraint that the residuals need to be below the expected 
cosmic signal level. Figure [l6] shows that a quadratic fit sat- 
isfies this requirement for our fiducial case. If Scut cannot 
be pushed down to lOmJy, however, then a higher order 
fit may be necessary. For example, Scut = 100 mjy would 
require a cubic fit. 



3.3.9 Range of the polynomial fit 

A choice must be made regarding the frequency range over 
which polynomial fits are applied to spectra of dirty map 
pixels. In general, one expects post-cleaning foreground 
residuals to increase with increasing frequency range, and 
this is what is seen in figure [iTj where the foreground clean- 
ing was performed using quadratic fits over various fre- 
quency ranges. If quadratic polynomials are used, one should 



4 SUMMARY AND DISCUSSION 

We have studied the problem of cleaning out point source 
foreground contamination from 21 cm tomography data, and 
investigated how the level of residual contamination after 
the cleaning process depends on various experimental and 
algorithmic parameters. Our results show that while success- 
ful foreground removal is far from guaranteed, the signs are 
encouraging. For instance, the fact that the post-cleaning 
residual foregrounds lie below the simulated cosmological 
signal in the fiducial model of Section [3.2| is promising, since 
the scenario in question is considered to be representative of 
current-generation experiments, and its cleaning algorithms 
are both simple and computationally cheap. These conclu- 
sions agree with those of the independent and concurrent 



analysis of Bowman et al. (20081, which is complementary 



by focusing on the specifics of the MWA experiment. As a 
further consistency check on our calculations and software, 
we and the authors of that paper agreed on a test example 
that we both simulated, obtaining consistent results. 

We also identified a number of aspects of the problem 
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that can be understood analytically. Noise and cosmic sig- 
nal decouple from the point source problem as long as the 
cleaning method is linear. The power spectrum of unresolved 
point sources in the observed sky simply gets suppressed by 
a fixed fc-dcpcndcnt factor, so any shifts in this input power 
spectrum due to revised source count estimates or altered 
flux cuts for resolved source removal simply scale the out- 
put power spectrum by the same amount. Finally, simple 
Fourier space considerations explain why line-of-sight clean- 
ing works as well as it docs. 

Based on our analysis, we can make several specific rec- 
ommendations regarding the foreground subtraction of un- 
resolved point sources, pertaining to both instrumentation 
and data reduction. 

• Instrumental rccommoiidations: 

- Tile arrangement should ensure good u-v after ro- 
tation synthesis; we found both a monolithic arrangement 
and an arrangement to work well with respect to fore- 
ground subtraction. 

- Rotation synthesis is important, but once an array 
has achieved 3 to 4 hours of integration, the advantage 
gained from further w-u coverage is minimal. 

- Time between snapshots taken of the sky should 
be made as short as possible. Ideally, one should have 
continuous u-v coverage, as is planned for experiments 
such as the MWA. 

• Data reduction/algorithmic recommendations: 

- Uniform weighting of the u-v plane outperforms 
"natural weighting" . 

- Adjusting for changes in the primary beam 
with frequency through an extra convolution kernel in 
u-v space results in a modest improvement in foreground 
subtraction. However, this conclusion may change if the 
primary beam has sidelobes. 

- Adjusting for changes in the synthesized beam 
with frequency appears not to be worthwhile. 

- Quadratic fits across a few MHz arc adequate for 
subtracting off spectrally smooth foreground components 
from the data as long as bright point sources can be re- 
solved and removed down to the 10 mjy level — otherwise 
higher orders should be considered. 

- The frequency range over which the polynomial 
fitting is performed should be narrow enough to allow 
one to see cosmic signal, but broad enough to prevent 
over-fitting which excessively removes cosmic signal. For 
quadratic fits, a range of a few MHz appears appropriate. 

- The ability to remove bright point sources 
prior to the subtraction of unresolved point sources is 
crucial, with the corresponding flux cut Scut being the 
most important parameter of all in our analysis. If one 
can safely fit a cubic to the spectra without loosing too 
much cosmic signal, then one needs to be able to remove 
resolved points sources of flux down to 100 mJy. 

Our results were deliberately conservative, obtained 
with an extremely simple line-of-sight cleaning procedure, 
and it is likely that one can do better. An interesting ques- 
tion for future work is determining the optimal procedure. 
For example, other classes of fitting functions may be worth 
considering, and it may be better to replace multiple poly- 



nomial fits in disjoint bands by a single fit to a function with 
more parameters, say a spline. Such work should quantify 
the extent to which one is removing power from the cos- 
mological signal, and optimize the tradeoff between more 
foreground removal and less signal removal. Such an opti- 
mization should ideally be done using a more complete sky 
model, including diffuse Galactic synchrotron radiation. 

For now, however, what is reassuring for the field of 21- 
cm tomography is the fact that all of the recommendations 
listed above can be followed without any substantial revi- 
sions to current experimental designs. The papers mentioned 
in the introduction have already shown that neither noise 
nor diffuse Galactic foreground emission appear to pose in- 
surmountable obstacles to 21 cm cosmology. Our results on 
point source cleaning suggest that arguably the most wor- 
risome remaining foreground problem is also surmountable, 
allowing 21 cm cosmology to live up to its great potential. 
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